問題描述
如何檢查我是否只刪除了所需的數據? (How do I check that I removed required data only?)
我有一個非常大的數據庫(在 PostgreSQL 上運行),其中包含許多具有復雜關係的表(外鍵、刪除級聯等)。我需要從多個表中刪除一些數據,但我不確定由於級聯刪除而真正從數據庫中刪除的數據量。
如何檢查我不會刪除數據那不應該被刪除嗎?
我有一個測試數據庫 ‑ 只是一個真實數據庫的副本,我可以在其中做我想做的事情:)
我唯一的想法是轉儲數據庫之前然後檢查它。但是看起來不太舒服。另一個想法 ‑ 轉儲部分數據庫,正如我認為的那樣,它不應該受到我的 DELETE 語句的影響,並在數據刪除前後檢查這部分。但我認為沒有簡單的方法可以做到這一點(有數百個表,刪除應該與其中的 10 個一起使用)。有什麼辦法嗎?
還有什麼其他的辦法可以解決這個問題嗎?
參考解法
方法 1:
You can query the information_schema to draw yourself a picture on how the constraints are defined in the database. Then you'll know what is going to happen when you delete. This will be useful not only for this case, but always.
Something like (for constraints)
select table_catalog,table_schema,table_name,column_name,rc.* from
information_schema.constraint_column_usage ccu,
information_schema.referential_constraints rc
where ccu.constraint_name = rc.constraint_name
方法 2:
Using psql, start a transaction, perform your deletes, then run whatever checking queries you can think of. You can then either rollback or commit.
方法 3:
If the worry is keys left dangling (i.e.: pointing to a deleted record) then run the deletion on your test database, then use queries to find any keys that now point to invalid targets. (while you're doing this you can also make sure the part that should be unaffected did not change)
A better solution would be to spend time mapping out the delete cascades so you know what to expect ‑ knowing how your database works is pretty valuable so the effort spent on this will be useful beyond this particular deletion.
And no matter how sure you are back the DB up before doing big changes!
方法 4:
Thanks for answers!
Vinko, your answer is very useful for me and I'll study it dipper.
actually, for my case, it was enough to compare tables counts before and after records deletion and check what tables were affected by it.
it was done by simple commands described below
psql ‑U U_NAME ‑h`hostname` ‑c '\d' | awk '{print $3}' > tables.list
for i in `cat tables.list `; do echo ‑n "$i: " >> tables.counts; psql ‑U U_NAME ‑h`hostname` ‑t ‑c "select count(*) from $i" >> tables.counts; done
for i in `cat tables.list `; do echo ‑n "$i: " >> tables.counts2; psql ‑U U_NAME ‑h`hostname` ‑t ‑c "select count(*) from $i" >> tables.counts2; done
diff tables.counts tables.counts2
(by Pavel、Vinko Vrsalovic、Stephen Denne、DrStalker、Pavel)